This paper addresses the problem of representing information in a form that makes it possible to form 
associations, measure similarity, and integrate new information with what was stored previously. Several 
simple models for encoding information into a sparse distributed representation are explored. These models 
are based on the idea that information about a stimulus is stored in a population rather than in an individual 
neuron, so each neuron learns many partial features. The results show the formation of a sparse representation 
of image data with high overlap for similar images. Each cell develops multiple receptive fields that together 
create a population receptive field. This was made possible by incorporating a dendritic tree into the standard 
neuron model. The models were also tested on the classification of handwritten digits from the MNIST dataset. 
The unsupervised representation gives poor accuracy compared to state-of-the-art supervised methods; 
however, because the models exhibit interesting properties, the idea merits further development. 

Keywords: dendritic computation, sparse representation, sparse coding, unsupervised learning. 



Introduction 

One of the main problems in creating intelligent machines is finding the 
right substrate for memory [1]. The question of how to encode information, and in which form 
it should be stored for efficient further processing, remains unanswered. A good candidate for 
such a substrate is hyperdimensional binary vectors [2], [3]. Vectors with dimensions on the 
order of 1000 provide a good framework for associating, comparing, and binding information 
about different objects. However, how to form such binary vectors from 
real-world raw data, such as pixel intensities, words, and sounds, remains an open issue [4]. 

The standard way to encode something is to create a dictionary, a correspondence 
between a feature and its binary code, like the letter “A” encoded as 1000001 in 
ASCII. This strategy shaped modern computer science and was even applied to 
computer vision problems, where a set of features, like the shape of an eye, was 
handcrafted and represented with an appropriate binary code. However, biological organisms 
do not have a predefined dictionary; they create an internal representation of the environment 
by themselves through self-organizing neural networks. If we want to create systems with 
artificial intelligence, we should abandon human-specified dictionaries and give such systems 
the ability to generate their own code of the world. 
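
As a small illustration of such a fixed dictionary, the sketch below maps a character to its 7-bit ASCII code using only Python built-ins. The function name `ascii_code` is introduced here for illustration and is not part of the original work.

```python
def ascii_code(ch: str) -> str:
    """Return the 7-bit ASCII code of a single character as a bit string."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


print(ascii_code("A"))  # 1000001
```

The code for each symbol is assigned by convention, so nothing in the bit pattern reflects what the symbol looks like or how it is used; this is the property that a self-organized code is meant to avoid.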

Recently, it became possible to learn features from raw sensory data by using artificial 
neural networks trained with supervised learning. Deep convolutional networks use small kernels 
in the lower layers that become feature detectors through backpropagation [5]. However, 
these systems still require humans to specify the dataset and the correct labels, in contrast to 
biological neural networks, which use a mixture of unsupervised and reinforcement learning. 

Another prominent result came from the field of sparse coding. It was shown that, 
under the constraint of reliably reconstructing an input image using only a small part of a 
dictionary, the system forms features that resemble the receptive fields of simple cells in the 
visual cortex [6]. Later works provided frameworks for forming biologically 
plausible features while adopting a sparse coding strategy [7]-[9]. There are two main 
problems with this approach. The first is that it requires solving an optimization problem 
in order to generate features, in contrast to biological organisms, which use self-organization. 
This complicates the implementation of online learning algorithms for practical 
applications in robotics, though there has recently been progress in this direction [10]. The 
second is that the input is reconstructed using real-valued coefficients, which prevents the use 
of the framework for hyperdimensional binary vectors [3]. 

The models closest to forming the desired representation were developed by Foldiak [11] 
and the Numenta team [12]. They form a sparse binary representation of an input 
using feature learning and homeostatic principles. Nevertheless, in these models cells learn 
to represent a limited set of features, which limits the representation capacity of the network. 

The goal of this work is to test simple models that form a sparse representation of 
image data based on a dendritic neuron model. These models allow a single neuron to encode 
multiple features, which increases the capacity of the network, and information is 
spread across a population of neurons. I also provide the biological background behind the 
idea of including dendrites. The models were designed to work 
online, which excludes solving an optimization problem, and to provide a large capacity 
by spreading features across a population. 

The results show the transformation of an image into a sparse distributed 
representation with high overlap for similar images. Each cell forms multiple receptive 
fields and, together with other cells, forms a population receptive field. However, the 
unsupervised models show poor classification accuracy (0.7) compared to supervised 
state-of-the-art methods (0.99). This could be due to poor feature extraction, so that the 
method works much like a comparison to the mean image. Although the presented results are 
worse than existing ones, the proposed ideas are not fully investigated and further research 
should be conducted. 

The paper is structured as follows. First, I provide the motivation for the work 
and its place in a global context. In the “Methods” section, I describe the computational 
models that were used. In “Biological Background”, I describe the computational properties of 
the dendritic tree, stressing that the standard model of an artificial neuron should be extended. 
In the “Results” section, I provide figures of pattern representation and pattern overlap, and 
the classification accuracy on the MNIST dataset for different models. Finally, I discuss 
possible reasons for the poor accuracy and future directions of research. 

Motivation 

The first important step that biological organisms perform is transforming 
raw sensory data from the environment into a sparse distributed representation. 
Recreating this in algorithms will make it possible to apply efficient association learning and 
to use the properties of high-dimensional binary vectors. In this form, it is convenient to check 
the similarity of information simply by computing the Hamming distance or, equivalently, the 
pattern overlap. It will also be possible to build sequential memory and to experiment with an 
action-perception loop. It will be feasible to use reinforcement learning to select the right 
action by sculpting patterns of activation and to efficiently predict the outcome of a 
planned action. In order to use all of this in real-time robotics, it is necessary to develop 
an algorithm for sparse coding that works online and adapts to a new incoming 
stream of data. 

Methods 

Handwritten digits from the MNIST database were used as input after 
thresholding and binarizing. The result was a binary vector $x$ of size 28*28 = 784. 

The representation layer $y$ was initialized with size $n \times n$ and random binary weights 
to the input layer. Two connection schemes were used: one with random all-to-all connections, 
with a certain percentage g of connected weights ($w_{ij} = 1$), and another with local 
connections topographically projected from the input to the representation layer (Fig. 1). 
Activation is calculated according to: 

$$ y_i = \sum_j w_{ij} x_j + \lambda \sum_k g\Big(\sum_j c_{ijk} x_j\Big) \qquad (1) $$

where $g(z) = 1$ if $z > \theta$ and $0$ otherwise, $\theta$ is the activation threshold for 
clustered synapses, $\lambda$ is a constant that regulates the influence of dendrites, and 
$c_{ijk}$ stores the synapses in clusters ($c_{ijk} = 1$ if input $j$ belongs to cluster $k$ of 
cell $i$). After $y$ is computed, a k-WTA (k-winners-take-all) operation is performed: the $k$ 
highest values of $y$ are set to one and the others to zero. Typically $k$ was set to a small 
value, $k = 0.1|y|$, compared to the size of $y$ in order to achieve a sparse representation. 
The result is a sparse binary vector $y$. 
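
The activation rule of Eq. (1) and the k-WTA step can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used for the experiments: the function names, the representation of clusters as lists of input indices, the tie-breaking rule in `k_wta`, and the toy values of `theta` (threshold) and `lam` (dendritic scaling constant) are all assumptions made for the example.

```python
def g(z, theta):
    """Threshold nonlinearity of a dendritic cluster."""
    return 1 if z > theta else 0


def activation(x, w_i, clusters_i, theta, lam):
    """Activation of one cell: linear drive plus thresholded cluster terms.

    x          -- binary input vector (list of 0/1)
    w_i        -- binary weights of cell i (same length as x)
    clusters_i -- clusters of cell i, each a list of input indices j
    """
    linear = sum(w * xj for w, xj in zip(w_i, x))
    dendritic = sum(g(sum(x[j] for j in cluster), theta) for cluster in clusters_i)
    return linear + lam * dendritic


def k_wta(y, k):
    """Set the k largest entries to 1 and the rest to 0 (ties broken by index)."""
    winners = sorted(range(len(y)), key=lambda i: (-y[i], i))[:k]
    out = [0] * len(y)
    for i in winners:
        out[i] = 1
    return out


# Toy example with 6 inputs and 3 cells (values chosen by hand).
x = [1, 1, 0, 1, 0, 0]
w = [[1, 0, 0, 0, 1, 0], [0, 1, 0, 1, 0, 0], [0, 0, 1, 0, 0, 1]]
clusters = [[[0, 1, 3]], [], [[2, 4]]]
y = [activation(x, w[i], clusters[i], theta=2, lam=2.0) for i in range(3)]
print(y)            # [3.0, 2.0, 0.0]
print(k_wta(y, 1))  # [1, 0, 0]
```

In this toy case the first cell wins because one of its clusters sees three coincident active inputs, even though the second cell receives the larger linear drive.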

Learning is implemented by storing new clusters of synapses. The pseudocode is presented below. 


for x in images 
    for i in range(size(y)) 
        # retrieval of activation 
        y_i = sum_j(w_ij * x_j) 
        for k in clusters of cell i 
            y_i = y_i + g(sum_j(c_ijk * x_j)) 
    y = k_WTA(y) 
    # learning clusters 
    for i in nonzero(y) 
        k = number of stored clusters of cell i + 1 
        cluster = select random S indices of active x_j 
        c_ijk = 1 where j in cluster 
    return y 
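
A runnable reading of the pseudocode for a single image is sketched below in plain Python. It is an interpretation under stated assumptions rather than the code used for the experiments: clusters are stored as lists of input indices, `lam` defaults to 1 so that the update matches the pseudocode, ties in k-WTA are broken by cell index, and the toy sizes and parameter values are hypothetical.

```python
import random


def encode_and_learn(x, w, clusters, k, theta, s, rng, lam=1.0):
    """Encode one binary image x and store one new cluster per active cell.

    w[i][j]     -- binary feedforward weights
    clusters[i] -- clusters stored on cell i, each a list of input indices
    Returns the sparse binary code of x.
    """
    # retrieval of activation
    y = []
    for i in range(len(w)):
        linear = sum(w[i][j] * x[j] for j in range(len(x)))
        dendritic = sum(1 for c in clusters[i] if sum(x[j] for j in c) > theta)
        y.append(linear + lam * dendritic)
    # k-WTA: the k most active cells fire
    winners = sorted(range(len(y)), key=lambda i: (-y[i], i))[:k]
    code = [1 if i in winners else 0 for i in range(len(y))]
    # learning clusters: each winner stores S random active inputs
    active = [j for j, xj in enumerate(x) if xj == 1]
    for i in winners:
        clusters[i].append(sorted(rng.sample(active, min(s, len(active)))))
    return code


rng = random.Random(0)
n_in, n_out = 16, 10
w = [[1 if rng.random() < 0.3 else 0 for _ in range(n_in)] for _ in range(n_out)]
clusters = [[] for _ in range(n_out)]
x = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
first = encode_and_learn(x, w, clusters, k=2, theta=2, s=4, rng=rng)
second = encode_and_learn(x, w, clusters, k=2, theta=2, s=4, rng=rng)
print(first == second)
```

Presenting the same image twice yields the same code in this sketch, because the clusters stored on the winning cells during the first pass raise their activation on the second pass.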
Fig. 1. Schematic of the connections from the input image to the sparse distributed representation layer. 


After a representation has been obtained for every image, it is possible to compute the 
pattern overlap between different representations. Pattern overlap for two binary vectors is 
measured through the Hamming distance: the smaller the distance, the higher the overlap. 
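
The relation between the two measures can be made explicit with a short sketch. For two binary codes with the same number k of active bits, as produced by k-WTA, the Hamming distance equals 2(k - overlap), so ranking by high overlap and ranking by low Hamming distance give the same order. The function names below are chosen for illustration.

```python
def overlap(a, b):
    """Number of positions where both binary vectors are 1."""
    return sum(ai & bi for ai, bi in zip(a, b))


def hamming(a, b):
    """Number of positions where the binary vectors differ."""
    return sum(ai != bi for ai, bi in zip(a, b))


a = [1, 1, 0, 0, 1, 0, 0, 0]
b = [1, 0, 0, 0, 1, 1, 0, 0]
print(overlap(a, b), hamming(a, b))  # 2 2
```

Here both codes have k = 3 active bits and share two of them, so the Hamming distance is 2(3 - 2) = 2.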

To calculate accuracy, the representation of every test image was compared to the mean 
pattern of all training images of each class. The class whose mean pattern has the highest 
overlap with the representation is assigned to the test image. 
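
This nearest-mean-pattern rule can be sketched as follows. The sketch assumes that the overlap between a binary code and a real-valued mean pattern is computed as their dot product, which the text does not state explicitly; the function names and the toy data are hypothetical.

```python
def mean_patterns(codes, labels):
    """Average the binary codes of each class into one mean pattern."""
    sums, counts = {}, {}
    for code, label in zip(codes, labels):
        acc = sums.setdefault(label, [0.0] * len(code))
        for j, bit in enumerate(code):
            acc[j] += bit
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def classify(code, means):
    """Return the class whose mean pattern has the highest overlap with code."""
    return max(means, key=lambda c: sum(m * bit for m, bit in zip(means[c], code)))


train_codes = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 1]]
train_labels = [0, 0, 1, 1]
means = mean_patterns(train_codes, train_labels)
print(classify([1, 1, 0, 0], means), classify([0, 0, 0, 1], means))  # 0 1
```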

Biological Background 

The main idea of this paper is to include dendritic computation in the neuron model 
and to try to obtain a sparse representation of raw sensory input based on this model. The first 
artificial neural networks were inspired by biology and were based on 
the information available 70 years ago. We now know much more about how real neurons 
function and acknowledge the importance of the dendritic tree. Synapses from 
thousands of neurons terminate on the vast dendritic tree of a single neuron, and it matters 
which part of the dendrite each synapse is connected to and which synapses are its 
neighbors. The presence of dendrites makes it possible to integrate input not just linearly, as 
was previously assumed, but also supra- and sublinearly (Fig. 2B) [13]-[15]. Thus, a small number 
of active neighboring synapses can elicit a dendritic spike and depolarize the cell much 
more strongly than if these synapses were distributed over different dendritic branches (Fig. 2A) 
[16], [17]. Such closely spaced synapses form clusters, and their possible computational role is 
to detect the coincident activation of particular neurons [18]. Every neuron has many 
such clusters, so a neuron works as a multiple-feature detector. This is in good 
agreement with population coding, where information is encoded not in individual 
neurons but shared across the population. Every neuron can take part in different 
populations, so it needs to learn connections to many populations, and clustered synapses 
on dendrites make this possible. This differs from classical Hebbian learning, where 
the connection between two neurons increases or decreases and tracks pairwise correlation. In 
this approach, a neuron learns higher-order correlations, and connections are formed not between 
individual neurons but between populations. 
Fig. 2. A) Distributed synapses (left) and clustered synapses (right). 
Modified from [19]. B) Difference between 
linear and supralinear summation. 
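
The difference between clustered and distributed input can be illustrated with a toy calculation. The sketch below is not a biophysical model: the branch threshold `theta`, the spike amplitude `spike`, and the function names are arbitrary assumptions chosen only to show that the same number of active synapses produces a larger somatic input when they fall on one branch.

```python
def branch_output(n_active, theta, spike=10.0):
    """Supralinear branch: a dendritic spike once active synapses exceed theta."""
    return spike if n_active > theta else float(n_active)


def soma_input(active_per_branch, theta):
    """Sum of branch outputs reaching the soma."""
    return sum(branch_output(n, theta) for n in active_per_branch)


clustered = soma_input([4, 0, 0, 0], theta=3)    # four synapses on one branch
distributed = soma_input([1, 1, 1, 1], theta=3)  # one synapse on each branch
print(clustered, distributed)  # 10.0 4.0
```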


In this paper, I try to simulate this ability of a neuron to learn higher-order 
correlations and to be sensitive to multiple features. It was shown that this ability 
increases the capacity of associative memory (results in the process of publication), but here 
I test whether it helps with the sparse representation of sensory data. 

Results 

The proposed methods are able to generate sparse binary vectors from images by 
applying k-WTA. To see whether similar patterns generate similar codes, the pattern overlap 
was computed. Fig. 3A presents the pattern overlap for four different digits; the 
darker the cell, the higher the overlap. The figure shows that digits from the same class have 
higher overlap; however, this distinction is not entirely clear. Some instances from different 
classes have high overlap. This high interclass overlap was the main reason for the poor 
classification accuracy, which is described later. 

Next, the idea of spreading information across a population was tested. Fig. 4A 
presents a collection of receptive fields of encoding cells and, on the right, the overall 
receptive field of the population. Fig. 3B presents more images of population receptive fields, 
which were computed as the linear sum of the receptive fields of individual neurons. 
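
The linear sum that gives a population receptive field can be sketched as follows. As a simplifying assumption, the receptive field of a cell is taken here as the count of how often each input appears in all of its stored clusters, whereas the figures use only the activated clusters; the function names and the toy clusters are hypothetical.

```python
def cell_receptive_field(clusters_i, n_inputs):
    """Count how often each input index appears in the clusters of one cell."""
    rf = [0] * n_inputs
    for cluster in clusters_i:
        for j in cluster:
            rf[j] += 1
    return rf


def population_receptive_field(code, clusters, n_inputs):
    """Linear sum of the receptive fields of all active cells in a code."""
    total = [0] * n_inputs
    for i, active in enumerate(code):
        if active:
            for j, v in enumerate(cell_receptive_field(clusters[i], n_inputs)):
                total[j] += v
    return total


clusters = [[[0, 1], [1, 2]], [[3]], [[2, 3]]]
print(population_receptive_field([1, 0, 1], clusters, n_inputs=4))  # [1, 2, 2, 1]
```

For image data the resulting vector of length 784 can be reshaped to 28 by 28 and displayed as an image.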


Fig. 3. A) Overlap of patterns from different images. Black shows high overlap. 
The visible black squares represent the overlap of patterns from similar images 
of one class. B) Lower: binary images of handwritten digits. 
Upper: receptive field of the entire population. 
The main deviation of the proposed models from the standard one is that each neuron has 
multiple receptive fields computed by dendrites. Fig. 4B presents such a collection of 
receptive fields of a single cell and, on the right, the combined total receptive field. 
Importantly, these receptive fields determine activation not through linear summation but 
through a threshold $\theta$ for coincidence detection. This result is very similar to coarse 
coding [20], [21], where cells have very wide, or coarse, receptive fields, but at the level of 
the population it is possible to decode a precise stimulus. 


Fig. 4. A) An image and the receptive fields of individual cells that encode the digit. The 
receptive fields were taken from the activated clusters. B) The receptive fields of individual 
clusters (panels 1 to 8) of one cell and the combined receptive field (panel “combined”), which 
resembles coarse coding. 


Classification accuracy was also tested. For all images from the 
same class, the average representation vector was computed. This was done for every class, and 
as a result 10 mean representation vectors were obtained. Then the representation of every image 
from the test set was compared to each mean representation by calculating the overlap. The 
highest overlap determined the recognized class. Accuracy was tested for different 
configurations of the model: with random connections, with localized connections, with 
clusters of activation that produce multiple receptive fields, and without clusters 
(merely linear activation and k-WTA). All four showed similar results, with accuracy near 
0.7. The same accuracy could be obtained just by comparing the image to the mean images 
without any transformation into a binary vector. This indicates that the features extracted for 
the binary representation are not properly learned and that the representation works just like 
the image transformed into a different form. 

Conclusions 

This paper deals with the problem of encoding information into a sparse 
distributed representation in order to use the hyperdimensional computing framework. 
Existing solutions do not satisfy all requirements: they lack online learning, use 
real values, or have low capacity. The proposed idea is to use an extended model of a 
neuron that includes dendritic computation to achieve a sparse data representation. It is 
assumed that the role of dendrites is to detect coincidences in incoming stimuli, not merely to 
perform linear summation. This allows a neuron to respond to many features, which is crucial 
for a large network capacity. 

The proposed models form a sparse representation with high overlap for similar images. 
Cells were also able to form many partial receptive fields using dendrites. Information about 
the image was spread across the population, which forms a combined receptive field. 
Furthermore, each cell forms many receptive fields that together form a much wider field. This 
relates to the idea of coarse coding, which was presented 30 years ago [21] but was not elaborated further. 
The classification accuracy achieved with the unsupervised representation is near 
0.7. This is considerably worse than state-of-the-art learning algorithms, which reach an 
accuracy above 0.99. Even a simple KNN algorithm gives more than 95% correct predictions. 
However, the presented idea shows interesting properties, such as a population receptive field, 
coarse coding, and multiple receptive fields per neuron, and is worth developing further. 
One reason for the low accuracy could be the absence of inhibition: the receptive fields are 
all positive, which leads to false activation. It is also possible that comparing 
representations from the first layer is not sufficient for higher accuracy; a hierarchical 
architecture may form a more stable representation for similar inputs. 

Overall, the model shows the desired properties: a neuron works as a multiple-pattern detector, information 
about an input is encoded across the whole population, and the code is sparse. However, there is 
a high overlap between patterns from different classes, which limits classification accuracy and 
suggests poor feature extraction. More research is needed in this direction. 
SUMMARY 

V.M. Osaulenko 

Testing simple neuron models with dendrites for sparse 
binary image representation 

This work considers neuron models that take the dendritic tree into account. 
This is motivated by recent studies of how the biological neuron works and of the 
computational properties of the dendritic tree, which show that the neuron as a whole 
has a considerably greater computational capacity than previously thought. In combination with 
the idea of sparse coding, the ability of a neuron to form many 
receptive fields from images of handwritten digits is demonstrated. Each receptive field 
is stored on a separate dendritic segment and retains partial features of the input 
data. Although the combined receptive field of a single neuron has no selectivity, 
selectivity emerges at the population level, which agrees well with the idea of coarse coding. 
In this way, images were encoded into a sparse representation of a neural 
network in which each neuron is trained on a wide range of stimuli. The 
efficiency of the coding was also tested on the task of classifying handwritten digits. 
The achieved accuracy is lower than that of other methods based on supervised learning, which 
indicates the need either to increase the size of the neuron's receptive field together with 
adding inhibitory neurons, or to add new layers of neurons. Sparse 
coding that accounts for dendrites has greater biological realism and a 
computational advantage, since it makes it possible to reduce the number of neurons. The reduction 
in the number of biological neurons in the human brain arises from the optimization 
of resources, and reducing the number of neurons in the model has an advantage for resource 
optimization when the algorithm is implemented in electronic devices. 